Overview

Dataset statistics

Number of variables: 12
Number of observations: 2172
Missing cells: 429
Missing cells (%): 1.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 203.8 KiB
Average record size in memory: 96.1 B

Variable types

Numeric: 11
Categorical: 1

Warnings

Year has constant value "0.0" [Constant]
df_index is highly correlated with ASGS_2016 [High correlation]
ASGS_2016 is highly correlated with df_index [High correlation]
Houses - median sale price ($) is highly correlated with HIGH_2 and 2 other fields [High correlation]
LF_4 is highly correlated with HIGH_6 [High correlation]
HIGH_2 is highly correlated with Houses - median sale price ($) and 3 other fields [High correlation]
HIGH_4 is highly correlated with Houses - median sale price ($) and 2 other fields [High correlation]
HIGH_5 is highly correlated with Houses - median sale price ($) and 3 other fields [High correlation]
HIGH_6 is highly correlated with LF_4 and 2 other fields [High correlation]
df_index is highly correlated with ASGS_2016 [High correlation]
ASGS_2016 is highly correlated with df_index [High correlation]
Houses - median sale price ($) is highly correlated with HIGH_2 and 3 other fields [High correlation]
LF_3 is highly correlated with LF_4 [High correlation]
LF_4 is highly correlated with LF_3 [High correlation]
HIGH_2 is highly correlated with Houses - median sale price ($) and 3 other fields [High correlation]
HIGH_4 is highly correlated with Houses - median sale price ($) and 2 other fields [High correlation]
HIGH_5 is highly correlated with Houses - median sale price ($) and 3 other fields [High correlation]
HIGH_6 is highly correlated with Houses - median sale price ($) and 2 other fields [High correlation]
df_index is highly correlated with ASGS_2016 [High correlation]
ASGS_2016 is highly correlated with df_index [High correlation]
Houses - median sale price ($) is highly correlated with HIGH_2 and 1 other field [High correlation]
HIGH_2 is highly correlated with Houses - median sale price ($) and 2 other fields [High correlation]
HIGH_4 is highly correlated with HIGH_2 and 1 other field [High correlation]
HIGH_5 is highly correlated with Houses - median sale price ($) and 2 other fields [High correlation]
LF_4 is highly correlated with HIGH_6 and 1 other field [High correlation]
HIGH_3 is highly correlated with HIGH_5 and 5 other fields [High correlation]
HIGH_6 is highly correlated with LF_4 and 3 other fields [High correlation]
HIGH_5 is highly correlated with HIGH_3 and 6 other fields [High correlation]
HIGH_4 is highly correlated with HIGH_3 and 5 other fields [High correlation]
HIGH_2 is highly correlated with LF_4 and 7 other fields [High correlation]
df_index is highly correlated with HIGH_3 and 5 other fields [High correlation]
Houses - median sale price ($) is highly correlated with HIGH_3 and 5 other fields [High correlation]
ASGS_2016 is highly correlated with HIGH_3 and 5 other fields [High correlation]
HIGH_7 is highly correlated with ASGS_2016 [High correlation]
Houses - median sale price ($) has 388 (17.9%) missing values [Missing]
HIGH_7 has 40 (1.8%) missing values [Missing]
df_index has unique values [Unique]
ASGS_2016 has unique values [Unique]

Reproduction

Analysis started: 2021-08-19 05:15:40.276183
Analysis finished: 2021-08-19 05:16:03.689680
Duration: 23.41 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

df_index
Real number (ℝ)

HIGH CORRELATION (×4)
UNIQUE

Distinct: 2172
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0
Minimum: -1.720800851
Maximum: 1.783358168
Zeros: 0
Zeros (%): 0.0%
Negative: 1093
Negative (%): 50.3%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -1.720800851
5-th percentile: -1.55171103
Q1: -0.8615859539
median: -0.01001871973
Q3: 0.8522552422
95-th percentile: 1.571441436
Maximum: 1.783358168
Range: 3.504159019
Interquartile range (IQR): 1.713841196

Descriptive statistics

Standard deviation: 1.000230282
Coefficient of variation (CV): nan
Kurtosis: -1.177481652
Mean: 0
Median Absolute Deviation (MAD): 0.8573029812
Skewness: 0.02138564561
Sum: 0
Variance: 1.000460617
Monotonicity: Strictly increasing
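
The coefficient of variation is the standard deviation divided by the mean, which is presumably why it is reported as nan for this column: the mean is 0. The following is a minimal sketch, assuming the sample standard deviation (n - 1 in the denominator) and a NaN result when the mean is zero. The function name `coefficient_of_variation` is hypothetical and is not part of pandas-profiling.

```python
import math


def coefficient_of_variation(values):
    """Return std / mean, or NaN when the mean is exactly zero.

    Assumption: the standard deviation is the sample one (ddof=1).
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    std = math.sqrt(variance)
    return std / mean if mean != 0 else math.nan


# A centred column: the mean is exactly 0, so the ratio is undefined.
centred = [-1.5, -0.5, 0.5, 1.5]
print(coefficient_of_variation(centred))

# An uncentred column gives an ordinary, finite ratio.
raw = [10.0, 12.0, 14.0, 16.0]
print(coefficient_of_variation(raw))
```

When the mean is not exactly zero but a floating-point residue on the order of 10^-16, the same ratio becomes very large, which matches the CV magnitudes of 10^15 and above reported for other variables in this report.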
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.6270315776 | 1 | < 0.1%
1.165427027 | 1 | < 0.1%
-1.433248736 | 1 | < 0.1%
1.04000536 | 1 | < 0.1%
0.8534023916 | 1 | < 0.1%
0.3517157228 | 1 | < 0.1%
0.9711763961 | 1 | < 0.1%
-0.8459082455 | 1 | < 0.1%
1.174604222 | 1 | < 0.1%
0.3731291782 | 1 | < 0.1%
Other values (2162) | 2162 | 99.5%

Smallest 10 values

Value | Count | Frequency (%)
-1.720800851 | 1 | < 0.1%
-1.719271318 | 1 | < 0.1%
-1.717741786 | 1 | < 0.1%
-1.716212253 | 1 | < 0.1%
-1.714682721 | 1 | < 0.1%
-1.713153188 | 1 | < 0.1%
-1.711623656 | 1 | < 0.1%
-1.710094123 | 1 | < 0.1%
-1.708564591 | 1 | < 0.1%
-1.707035058 | 1 | < 0.1%

Largest 10 values

Value | Count | Frequency (%)
1.783358168 | 1 | < 0.1%
1.781828635 | 1 | < 0.1%
1.780299103 | 1 | < 0.1%
1.77876957 | 1 | < 0.1%
1.775710505 | 1 | < 0.1%
1.774180973 | 1 | < 0.1%
1.768062842 | 1 | < 0.1%
1.765003777 | 1 | < 0.1%
1.763474245 | 1 | < 0.1%
1.761944712 | 1 | < 0.1%

ASGS_2016
Real number (ℝ)

HIGH CORRELATION (×4)
UNIQUE

Distinct: 2172
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -2.093680253 × 10^-16
Minimum: -1.115755843
Maximum: 3.065573447
Zeros: 0
Zeros (%): 0.0%
Negative: 1387
Negative (%): 63.9%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -1.115755843
5-th percentile: -1.089674907
Q1: -0.9746887552
median: -0.05482193222
Q3: 0.4731101426
95-th percentile: 2.02551762
Maximum: 3.065573447
Range: 4.181329291
Interquartile range (IQR): 1.447798898

Descriptive statistics

Standard deviation: 1.000230282
Coefficient of variation (CV): -4.777378402 × 10^15
Kurtosis: 0.2524799437
Mean: -2.093680253 × 10^-16
Median Absolute Deviation (MAD): 0.5382546525
Skewness: 0.9872188556
Sum: -4.547473509 × 10^-13
Variance: 1.000460617
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.01312487545 | 1 | < 0.1%
0.4731101518 | 1 | < 0.1%
-0.5302251901 | 1 | < 0.1%
-0.5406792241 | 1 | < 0.1%
-0.5408361037 | 1 | < 0.1%
-0.5302256866 | 1 | < 0.1%
2.542920946 | 1 | < 0.1%
-0.9798633828 | 1 | < 0.1%
2.025337203 | 1 | < 0.1%
-1.089518036 | 1 | < 0.1%
Other values (2162) | 2162 | 99.5%

Smallest 10 values

Value | Count | Frequency (%)
-1.115755843 | 1 | < 0.1%
-1.115755838 | 1 | < 0.1%
-1.115755833 | 1 | < 0.1%
-1.115755827 | 1 | < 0.1%
-1.115755822 | 1 | < 0.1%
-1.115755817 | 1 | < 0.1%
-1.115703546 | 1 | < 0.1%
-1.115703541 | 1 | < 0.1%
-1.115703536 | 1 | < 0.1%
-1.115703531 | 1 | < 0.1%

Largest 10 values

Value | Count | Frequency (%)
3.065573447 | 1 | < 0.1%
3.065521177 | 1 | < 0.1%
3.065468906 | 1 | < 0.1%
3.065416636 | 1 | < 0.1%
2.543286919 | 1 | < 0.1%
2.543234648 | 1 | < 0.1%
2.543234627 | 1 | < 0.1%
2.543182231 | 1 | < 0.1%
2.543182226 | 1 | < 0.1%
2.543182221 | 1 | < 0.1%

Year
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 127.4 KiB

Value | Count
0.0 | 2172

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6516
Distinct characters: 2
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 2172 | 100.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0.0 | 2172 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 4344 | 66.7%
. | 2172 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4344 | 66.7%
Other Punctuation | 2172 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4344 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 2172 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6516 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 4344 | 66.7%
. | 2172 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6516 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 4344 | 66.7%
. | 2172 | 33.3%

Houses - median sale price ($)
Real number (ℝ)

HIGH CORRELATION (×4)
MISSING

Distinct: 907
Distinct (%): 50.8%
Missing: 388
Missing (%): 17.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.779435442 × 10^-17
Minimum: -1.232746921
Maximum: 8.203826913
Zeros: 0
Zeros (%): 0.0%
Negative: 1166
Negative (%): 53.7%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -1.232746921
5-th percentile: -0.9099667295
Q1: -0.5972385163
median: -0.239834844
Q3: 0.2245107083
95-th percentile: 1.916537875
Maximum: 8.203826913
Range: 9.436573834
Interquartile range (IQR): 0.8217492246

Descriptive statistics

Standard deviation: 1.000280387
Coefficient of variation (CV): 2.09288398 × 10^16
Kurtosis: 14.47398449
Mean: 4.779435442 × 10^-17
Median Absolute Deviation (MAD): 0.3797414018
Skewness: 3.054097813
Sum: 8.526512829 × 10^-14
Variance: 1.000560852
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
-0.6419139753 | 15 | 0.7%
-0.3515234916 | 15 | 0.7%
-0.5972385163 | 15 | 0.7%
-0.5078875982 | 14 | 0.6%
-0.4520432744 | 14 | 0.6%
-0.608407381 | 13 | 0.6%
-0.7536026229 | 13 | 0.6%
-0.195159385 | 11 | 0.5%
-0.2845103031 | 11 | 0.5%
-0.2286659793 | 11 | 0.5%
Other values (897) | 1652 | 76.1%
(Missing) | 388 | 17.9%

Smallest 10 values

Value | Count | Frequency (%)
-1.232746921 | 1 | < 0.1%
-1.189188348 | 1 | < 0.1%
-1.185279246 | 1 | < 0.1%
-1.172435051 | 1 | < 0.1%
-1.166850619 | 1 | < 0.1%
-1.146746662 | 1 | < 0.1%
-1.135577798 | 2 | 0.1%
-1.134460911 | 1 | < 0.1%
-1.1188245 | 1 | < 0.1%
-1.116590728 | 1 | < 0.1%

Largest 10 values

Value | Count | Frequency (%)
8.203826913 | 1 | < 0.1%
8.069800536 | 2 | 0.1%
7.623045946 | 1 | < 0.1%
6.729536765 | 1 | < 0.1%
5.953300664 | 1 | < 0.1%
5.791352125 | 1 | < 0.1%
5.054207051 | 1 | < 0.1%
4.858751918 | 1 | < 0.1%
4.763816568 | 1 | < 0.1%
4.607452461 | 1 | < 0.1%

LF_3
Real number (ℝ)

HIGH CORRELATION

Distinct: 802
Distinct (%): 36.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.925650474 × 10^-17
Minimum: -1.349623832
Maximum: 9.215684801
Zeros: 0
Zeros (%): 0.0%
Negative: 1270
Negative (%): 58.5%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -1.349623832
5-th percentile: -1.100762173
Q1: -0.7764878893
median: -0.2410582583
Q3: 0.5026987467
95-th percentile: 1.949678471
Maximum: 9.215684801
Range: 10.56530863
Interquartile range (IQR): 1.279186636

Descriptive statistics

Standard deviation: 1.000230282
Coefficient of variation (CV): 2.547935148 × 10^16
Kurtosis: 4.573269343
Mean: 3.925650474 × 10^-17
Median Absolute Deviation (MAD): 0.6033009926
Skewness: 1.494104742
Sum: 8.526512829 × 10^-14
Variance: 1.000460617
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
-0.4031954001 | 11 | 0.5%
-0.4409017121 | 11 | 0.5%
-0.6482864284 | 10 | 0.5%
-0.7538641021 | 10 | 0.5%
-0.7764878893 | 10 | 0.5%
-0.9235425063 | 10 | 0.5%
-0.9386250311 | 10 | 0.5%
-1.032890811 | 9 | 0.4%
-1.05928523 | 9 | 0.4%
-1.085679648 | 9 | 0.4%
Other values (792) | 2073 | 95.4%

Smallest 10 values

Value | Count | Frequency (%)
-1.349623832 | 4 | 0.2%
-1.338311939 | 1 | < 0.1%
-1.334541308 | 1 | < 0.1%
-1.327000045 | 1 | < 0.1%
-1.31191752 | 2 | 0.1%
-1.300605627 | 2 | 0.1%
-1.293064364 | 2 | 0.1%
-1.289293733 | 1 | < 0.1%
-1.285523102 | 1 | < 0.1%
-1.277981839 | 2 | 0.1%

Largest 10 values

Value | Count | Frequency (%)
9.215684801 | 1 | < 0.1%
4.777651874 | 1 | < 0.1%
4.547643371 | 1 | < 0.1%
4.525019584 | 1 | < 0.1%
4.423212541 | 1 | < 0.1%
4.23091035 | 1 | < 0.1%
4.196974669 | 1 | < 0.1%
4.01221374 | 1 | < 0.1%
3.948113009 | 1 | < 0.1%
3.797287761 | 1 | < 0.1%

LF_4
Real number (ℝ)

HIGH CORRELATION (×3)

Distinct: 167
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -2.878810347 × 10^-16
Minimum: -1.614003279
Maximum: 12.92939607
Zeros: 0
Zeros (%): 0.0%
Negative: 1325
Negative (%): 61.0%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -1.614003279
5-th percentile: -0.9581244848
Q1: -0.5874103838
median: -0.1881798134
Q3: 0.2966001649
95-th percentile: 1.551324815
Maximum: 12.92939607
Range: 14.54339935
Interquartile range (IQR): 0.8840105487

Descriptive statistics

Standard deviation: 1.000230282
Coefficient of variation (CV): -3.47445702 × 10^15
Kurtosis: 37.6204057
Mean: -2.878810347 × 10^-16
Median Absolute Deviation (MAD): 0.4277470397
Skewness: 4.366645876
Sum: -6.252776075 × 10^-13
Variance: 1.000460617
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
-0.2166962827 | 52 | 2.4%
-0.1596633441 | 48 | 2.2%
-0.4163115679 | 48 | 2.2%
-0.5018609758 | 45 | 2.1%
-0.5303774451 | 44 | 2.0%
-0.6444433224 | 44 | 2.0%
-0.4448280372 | 43 | 2.0%
0.03995194112 | 43 | 2.0%
-0.5874103838 | 42 | 1.9%
-0.3307621599 | 41 | 1.9%
Other values (157) | 1722 | 79.3%

Smallest 10 values

Value | Count | Frequency (%)
-1.614003279 | 1 | < 0.1%
-1.55697034 | 1 | < 0.1%
-1.528453871 | 1 | < 0.1%
-1.471420932 | 2 | 0.1%
-1.442904463 | 1 | < 0.1%
-1.414387994 | 2 | 0.1%
-1.385871525 | 1 | < 0.1%
-1.357355055 | 3 | 0.1%
-1.328838586 | 3 | 0.1%
-1.300322117 | 4 | 0.2%

Largest 10 values

Value | Count | Frequency (%)
12.92939607 | 1 | < 0.1%
10.96175969 | 1 | < 0.1%
10.90472675 | 1 | < 0.1%
10.19181502 | 1 | < 0.1%
8.765991551 | 1 | < 0.1%
6.570223414 | 1 | < 0.1%
6.427641067 | 2 | 0.1%
6.113959905 | 1 | < 0.1%
5.88582815 | 1 | < 0.1%
5.258465825 | 1 | < 0.1%

HIGH_2
Real number (ℝ)

HIGH CORRELATION (×4)

Distinct: 573
Distinct (%): 26.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.05650549 × 10^-16
Minimum: -2.550450043
Maximum: 2.844077604
Zeros: 0
Zeros (%): 0.0%
Negative: 1133
Negative (%): 52.2%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -2.550450043
5-th percentile: -1.426872139
Q1: -0.851546104
median: -0.06301100873
Q3: 0.7745960131
95-th percentile: 1.703239919
Maximum: 2.844077604
Range: 5.394527648
Interquartile range (IQR): 1.626142117

Descriptive statistics

Standard deviation: 1.000230282
Coefficient of variation (CV): 2.465743692 × 10^15
Kurtosis: -0.8914402206
Mean: 4.05650549 × 10^-16
Median Absolute Deviation (MAD): 0.8088407201
Skewness: 0.2056678441
Sum: 8.810729923 × 10^-13
Variance: 1.000460617
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
-0.6755640227 | 11 | 0.5%
0.06897555229 | 11 | 0.5%
-1.196741725 | 11 | 0.5%
-0.9327686031 | 11 | 0.5%
-0.9056944368 | 9 | 0.4%
1.084256791 | 9 | 0.4%
-1.034296727 | 9 | 0.4%
-0.5604988156 | 9 | 0.4%
-0.5131190245 | 9 | 0.4%
-0.9733798527 | 9 | 0.4%
Other values (563) | 2074 | 95.5%

Smallest 10 values

Value | Count | Frequency (%)
-2.550450043 | 1 | < 0.1%
-2.313551088 | 1 | < 0.1%
-2.266171296 | 1 | < 0.1%
-2.232328589 | 1 | < 0.1%
-2.198485881 | 1 | < 0.1%
-2.191717339 | 1 | < 0.1%
-2.171411714 | 1 | < 0.1%
-2.164643173 | 1 | < 0.1%
-2.151106089 | 1 | < 0.1%
-2.117263381 | 1 | < 0.1%

Largest 10 values

Value | Count | Frequency (%)
2.844077604 | 1 | < 0.1%
2.566567399 | 1 | < 0.1%
2.424428026 | 1 | < 0.1%
2.214603236 | 2 | 0.1%
2.194297612 | 1 | < 0.1%
2.167223445 | 1 | < 0.1%
2.153686362 | 1 | < 0.1%
2.133380737 | 1 | < 0.1%
2.106306571 | 3 | 0.1%
2.086000946 | 1 | < 0.1%

HIGH_3
Real number (ℝ)

HIGH CORRELATION

Distinct: 211
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -1.57026019 × 10^-16
Minimum: -1.939224627
Maximum: 3.22632218
Zeros: 0
Zeros (%): 0.0%
Negative: 1362
Negative (%): 62.7%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -1.939224627
5-th percentile: -1.22755213
Q1: -0.6760341412
median: -0.270008628
Q3: 0.4743714796
95-th percentile: 2.053359587
Maximum: 3.22632218
Range: 5.165546807
Interquartile range (IQR): 1.150405621

Descriptive statistics

Standard deviation: 1.000230282
Coefficient of variation (CV): -6.36983787 × 10^15
Kurtosis: 0.5998284352
Mean: -1.57026019 × 10^-16
Median Absolute Deviation (MAD): 0.518810378
Skewness: 1.060748243
Sum: -3.410605132 × 10^-13
Variance: 1.000460617
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
-0.3827934927 | 39 | 1.8%
-0.4053504657 | 36 | 1.7%
-0.4730213846 | 36 | 1.7%
-0.4279074387 | 34 | 1.6%
-0.3376795468 | 34 | 1.6%
-0.1572237632 | 34 | 1.6%
-0.5181353305 | 34 | 1.6%
-0.6534771682 | 33 | 1.5%
-0.5858062494 | 33 | 1.5%
-0.6309201953 | 33 | 1.5%
Other values (201) | 1826 | 84.1%

Smallest 10 values

Value | Count | Frequency (%)
-1.939224627 | 1 | < 0.1%
-1.73621187 | 1 | < 0.1%
-1.691097924 | 1 | < 0.1%
-1.668540951 | 1 | < 0.1%
-1.623427005 | 1 | < 0.1%
-1.600870032 | 1 | < 0.1%
-1.555756087 | 3 | 0.1%
-1.533199114 | 3 | 0.1%
-1.510642141 | 1 | < 0.1%
-1.465528195 | 8 | 0.4%

Largest 10 values

Value | Count | Frequency (%)
3.22632218 | 1 | < 0.1%
3.181208235 | 1 | < 0.1%
3.158651262 | 2 | 0.1%
3.136094289 | 2 | 0.1%
3.06842337 | 1 | < 0.1%
3.045866397 | 1 | < 0.1%
3.023309424 | 2 | 0.1%
2.978195478 | 4 | 0.2%
2.955638505 | 3 | 0.1%
2.933081532 | 3 | 0.1%

HIGH_4
Real number (ℝ)

HIGH CORRELATION (×4)

Distinct: 357
Distinct (%): 16.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -5.234200632 × 10^-17
Minimum: -2.392584478
Maximum: 2.63359091
Zeros: 0
Zeros (%): 0.0%
Negative: 1107
Negative (%): 51.0%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -2.392584478
5-th percentile: -1.588876241
Q1: -0.7851680053
median: -0.02944235045
Q3: 0.7982571763
95-th percentile: 1.613961058
Maximum: 2.63359091
Range: 5.026175387
Interquartile range (IQR): 1.583425182

Descriptive statistics

Standard deviation: 1.000230282
Coefficient of variation (CV): -1.910951361 × 10^16
Kurtosis: -0.8398612335
Mean: -5.234200632 × 10^-17
Median Absolute Deviation (MAD): 0.7917125908
Skewness: 0.06053058004
Sum: -1.136868377 × 10^-13
Variance: 1.000460617
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.06652281208 | 17 | 0.8%
-0.1733900942 | 16 | 0.7%
0.006544585497 | 15 | 0.7%
-0.02944235045 | 15 | 0.7%
0.5463486247 | 15 | 0.7%
1.110143955 | 14 | 0.6%
0.09051410271 | 14 | 0.6%
-0.08942057703 | 14 | 0.6%
-0.05343364108 | 14 | 0.6%
0.1265010387 | 14 | 0.6%
Other values (347) | 2024 | 93.2%

Smallest 10 values

Value | Count | Frequency (%)
-2.392584478 | 1 | < 0.1%
-2.224645443 | 1 | < 0.1%
-2.188658507 | 1 | < 0.1%
-2.164667217 | 1 | < 0.1%
-2.128680281 | 2 | 0.1%
-2.080697699 | 1 | < 0.1%
-2.068702054 | 2 | 0.1%
-2.044710763 | 1 | < 0.1%
-1.996728182 | 1 | < 0.1%
-1.984732537 | 3 | 0.1%

Largest 10 values

Value | Count | Frequency (%)
2.63359091 | 1 | < 0.1%
2.489643166 | 1 | < 0.1%
2.465651875 | 1 | < 0.1%
2.333699777 | 1 | < 0.1%
2.321704131 | 1 | < 0.1%
2.261725905 | 2 | 0.1%
2.24973026 | 1 | < 0.1%
2.237734614 | 1 | < 0.1%
2.213743324 | 1 | < 0.1%
2.177756388 | 1 | < 0.1%

HIGH_5
Real number (ℝ)

HIGH CORRELATION (×4)

Distinct: 129
Distinct (%): 5.9%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.618305797 × 10^-17
Minimum: -2.135829557
Maximum: 3.482352763
Zeros: 0
Zeros (%): 0.0%
Negative: 1187
Negative (%): 54.7%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -2.135829557
5-th percentile: -1.481657643
Q1: -0.7505243275
median: -0.1348331143
Q3: 0.6347809021
95-th percentile: 1.827682628
Maximum: 3.482352763
Range: 5.61818232
Interquartile range (IQR): 1.38530523

Descriptive statistics

Standard deviation: 1.000230388
Coefficient of variation (CV): 3.820143504 × 10^16
Kurtosis: -0.2743590457
Mean: 2.618305797 × 10^-17
Median Absolute Deviation (MAD): 0.6926526148
Skewness: 0.4994493924
Sum: 5.684341886 × 10^-14
Variance: 1.000460829
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
-0.1733138151 | 44 | 2.0%
-0.7120436266 | 43 | 2.0%
0.01908968898 | 41 | 1.9%
-0.3272366184 | 41 | 1.9%
-0.9044471308 | 37 | 1.7%
-0.4041980201 | 37 | 1.7%
0.05757038981 | 37 | 1.7%
-0.3657173192 | 36 | 1.7%
-0.2887559176 | 36 | 1.7%
-0.5581208234 | 36 | 1.7%
Other values (119) | 1783 | 82.1%

Smallest 10 values

Value | Count | Frequency (%)
-2.135829557 | 1 | < 0.1%
-1.981906754 | 3 | 0.1%
-1.943426053 | 1 | < 0.1%
-1.904945352 | 1 | < 0.1%
-1.866464651 | 1 | < 0.1%
-1.827983951 | 6 | 0.3%
-1.78950325 | 3 | 0.1%
-1.751022549 | 7 | 0.3%
-1.712541848 | 11 | 0.5%
-1.674061147 | 8 | 0.4%

Largest 10 values

Value | Count | Frequency (%)
3.482352763 | 1 | < 0.1%
3.366910661 | 1 | < 0.1%
3.32842996 | 1 | < 0.1%
3.289949259 | 1 | < 0.1%
3.020584353 | 2 | 0.1%
2.828180849 | 1 | < 0.1%
2.674258046 | 2 | 0.1%
2.635777345 | 1 | < 0.1%
2.597296644 | 1 | < 0.1%
2.558815943 | 4 | 0.2%

HIGH_6
Real number (ℝ)

HIGH CORRELATION (×3)

Distinct: 137
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -6.54275079 × 10^-18
Minimum: -1.765924779
Maximum: 11.69777108
Zeros: 0
Zeros (%): 0.0%
Negative: 1219
Negative (%): 56.1%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -1.765924779
5-th percentile: -1.256487638
Q1: -0.7470504977
median: -0.1284482556
Q3: 0.5265423537
95-th percentile: 1.707344869
Maximum: 11.69777108
Range: 13.46369586
Interquartile range (IQR): 1.273592851

Descriptive statistics

Standard deviation: 1.000230282
Coefficient of variation (CV): -1.528761089 × 10^17
Kurtosis: 12.6913433
Mean: -6.54275079 × 10^-18
Median Absolute Deviation (MAD): 0.6186022421
Skewness: 1.944662053
Sum: -1.421085472 × 10^-14
Variance: 1.000460617
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
-0.05567152125 | 49 | 2.3%
0.08988194748 | 44 | 2.0%
-0.09205988843 | 43 | 2.0%
-0.9289923336 | 42 | 1.9%
-0.20122499 | 39 | 1.8%
-0.3831668259 | 38 | 1.7%
-0.6742737634 | 38 | 1.7%
-0.7470504977 | 38 | 1.7%
0.1626586818 | 38 | 1.7%
-0.2740017243 | 37 | 1.7%
Other values (127) | 1766 | 81.3%

Smallest 10 values

Value | Count | Frequency (%)
-1.765924779 | 1 | < 0.1%
-1.656759677 | 3 | 0.1%
-1.62037131 | 1 | < 0.1%
-1.583982943 | 4 | 0.2%
-1.547594576 | 2 | 0.1%
-1.511206209 | 10 | 0.5%
-1.474817841 | 9 | 0.4%
-1.438429474 | 11 | 0.5%
-1.402041107 | 8 | 0.4%
-1.36565274 | 12 | 0.6%

Largest 10 values

Value | Count | Frequency (%)
11.69777108 | 1 | < 0.1%
7.731439056 | 1 | < 0.1%
7.076448447 | 1 | < 0.1%
6.967283345 | 1 | < 0.1%
5.329806822 | 1 | < 0.1%
4.929534783 | 1 | < 0.1%
3.947048869 | 1 | < 0.1%
3.910660502 | 2 | 0.1%
3.655941931 | 1 | < 0.1%
3.437611728 | 1 | < 0.1%

HIGH_7
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 67
Distinct (%): 3.1%
Missing: 40
Missing (%): 1.8%
Infinite: 0
Infinite (%): 0.0%
Mean: -1.066480654 × 10^-16
Minimum: -0.6043119287
Maximum: 20.23051642
Zeros: 0
Zeros (%): 0.0%
Negative: 1544
Negative (%): 71.1%
Memory size: 17.1 KiB

Quantile statistics

Minimum: -0.6043119287
5-th percentile: -0.6043119287
Q1: -0.5133301455
median: -0.3313665791
Q3: 0.03256055375
95-th percentile: 1.852196218
Maximum: 20.23051642
Range: 20.83482835
Interquartile range (IQR): 0.5458906992

Descriptive statistics

Standard deviation: 1.000234604
Coefficient of variation (CV): -9.378834959 × 10^15
Kurtosis: 88.27369855
Mean: -1.066480654 × 10^-16
Median Absolute Deviation (MAD): 0.1819635664
Skewness: 6.462841454
Sum: -2.273736754 × 10^-13
Variance: 1.000469263
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
-0.5133301455 | 369 | 17.0%
-0.4223483623 | 344 | 15.8%
-0.3313665791 | 280 | 12.9%
-0.2403847959 | 172 | 7.9%
-0.6043119287 | 169 | 7.8%
-0.1494030127 | 117 | 5.4%
-0.05842122946 | 93 | 4.3%
0.03256055375 | 84 | 3.9%
0.1235423369 | 58 | 2.7%
0.2145241202 | 53 | 2.4%
Other values (57) | 393 | 18.1%
(Missing) | 40 | 1.8%

Smallest 10 values

Value | Count | Frequency (%)
-0.6043119287 | 169 | 7.8%
-0.5133301455 | 369 | 17.0%
-0.4223483623 | 344 | 15.8%
-0.3313665791 | 280 | 12.9%
-0.2403847959 | 172 | 7.9%
-0.1494030127 | 117 | 5.4%
-0.05842122946 | 93 | 4.3%
0.03256055375 | 84 | 3.9%
0.1235423369 | 58 | 2.7%
0.2145241202 | 53 | 2.4%

Largest 10 values

Value | Count | Frequency (%)
20.23051642 | 1 | < 0.1%
7.402084993 | 1 | < 0.1%
6.674230728 | 1 | < 0.1%
6.492267161 | 1 | < 0.1%
6.401285378 | 1 | < 0.1%
6.128340028 | 1 | < 0.1%
5.855394679 | 1 | < 0.1%
5.764412896 | 1 | < 0.1%
5.673431112 | 1 | < 0.1%
5.491467546 | 3 | 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
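
The calculation described above can be sketched in plain Python. This is an illustrative sketch only, not the code the report generator runs; the function name `pearson_r` is hypothetical, and the sample (n - 1) normalisation is an assumption that cancels out in the ratio.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(pearson_r(x, y))

# Shifting and rescaling y leaves r unchanged (up to floating-point error),
# which illustrates the location and scale invariance.
print(pearson_r(x, [3.0 * v + 7.0 for v in y]))
```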

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
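
As a sketch of this rank-based calculation: convert each variable to ranks, then apply the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and giving tied values the mean of their rank positions is an assumption (it is a common convention, but the report does not state how ties are handled).

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # nonlinear but strictly increasing
print(pearson_r(x, y))     # below 1: the relationship is not linear
print(spearman_rho(x, y))  # the ranks agree perfectly
```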

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
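
The pair-counting definition above corresponds to the variant usually called tau-a, sketched below. The function name `kendall_tau_a` is hypothetical. Statistical libraries often apply a tie correction (tau-b) instead, so results on data with many tied values, such as the columns in this report, may differ from this simple form.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y: counted in neither group
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Two concordant pairs and one discordant pair out of three.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```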

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
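
To make the nullity correlation concrete, the sketch below treats each column as a 0/1 presence indicator and correlates the indicators. This is an illustration of the idea, not the report's own implementation; the function name `nullity_correlation` and the use of `None` to mark a missing cell are assumptions.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    A value near +1 means the columns tend to be present (or missing) together;
    a value near -1 means one tends to be present when the other is missing.
    Returns NaN when either column is entirely present or entirely missing.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return math.nan
    return cov / math.sqrt(var_a * var_b)


# Hypothetical columns: the second is missing exactly where the first is present.
first = [1.0, None, None, 2.0]
second = [None, 3.0, 4.0, None]
print(nullity_correlation(first, second))
```

A fully populated column has no variation in its presence indicator, so its nullity correlation with any other column is undefined.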

Sample

First rows

Row | df_index | ASGS_2016 | Year | Houses - median sale price ($) | LF_3 | LF_4 | HIGH_2 | HIGH_3 | HIGH_4 | HIGH_5 | HIGH_6 | HIGH_7
0 | -1.720801 | -1.115756 | 0.0 | -0.364926 | -1.085680 | -0.787026 | -0.398054 | -0.856490 | 0.618322 | 0.634781 | -0.455944 | -0.513330
1 | -1.719271 | -1.115756 | 0.0 | NaN | -0.576644 | -0.644443 | -0.161155 | -0.563249 | 0.546349 | 0.480858 | 0.380989 | -0.240385
2 | -1.717742 | -1.115756 | 0.0 | NaN | -0.207123 | -0.501861 | 0.190809 | -0.879047 | -0.053434 | -0.211795 | 0.308212 | 0.123542
3 | -1.716212 | -1.115756 | 0.0 | NaN | -0.833047 | -0.558894 | 0.630765 | -0.563249 | -0.173390 | -0.404198 | -0.637885 | -0.331367
4 | -1.714683 | -1.115756 | 0.0 | 0.117569 | -0.361718 | -1.157740 | 0.522468 | -0.788819 | -0.173390 | -0.519640 | -1.001769 | -0.513330
5 | -1.713153 | -1.115756 | 0.0 | NaN | -0.403195 | -1.015157 | 0.799978 | -0.608363 | -0.173390 | -0.365717 | -0.783439 | -0.513330
6 | -1.711624 | -1.115704 | 0.0 | NaN | -1.172404 | -0.701476 | -1.095214 | -0.405350 | 1.134135 | 1.596798 | 1.254310 | 0.123542
7 | -1.710094 | -1.115704 | 0.0 | NaN | -0.821735 | -0.644443 | -0.858315 | -0.585806 | 0.846240 | 0.827184 | 0.308212 | -0.240385
8 | -1.708565 | -1.115704 | 0.0 | NaN | -1.134698 | -0.929608 | -0.533425 | -0.630920 | 0.786262 | 0.827184 | -0.201225 | -0.422348
9 | -1.707035 | -1.115704 | 0.0 | NaN | -1.085680 | -1.471421 | 0.184041 | -0.427907 | 0.246457 | -0.481159 | -0.892604 | -0.513330

Last rows

Row | df_index | ASGS_2016 | Year | Houses - median sale price ($) | LF_3 | LF_4 | HIGH_2 | HIGH_3 | HIGH_4 | HIGH_5 | HIGH_6 | HIGH_7
2162 | 1.761945 | 2.543182 | 0.0 | 0.318608 | -1.146010 | -0.729993 | 1.422684 | -1.059503 | -0.989094 | -0.942928 | -0.856216 | -0.331367
2163 | 1.763474 | 2.543182 | 0.0 | NaN | -1.040432 | -0.672960 | 1.903250 | -1.510642 | -1.540894 | -1.712542 | -1.438429 | -0.513330
2164 | 1.765004 | 2.543182 | 0.0 | 0.336479 | -1.149780 | -0.644443 | 1.436221 | -0.946718 | -0.965103 | -0.673563 | -0.928992 | -0.513330
2165 | 1.768063 | 2.543235 | 0.0 | NaN | -1.228964 | -1.015157 | 1.835565 | -1.217401 | -1.420937 | -1.597100 | -1.656760 | -0.513330
2166 | 1.774181 | 2.543235 | 0.0 | NaN | -1.221422 | -1.414388 | 2.052158 | -1.217401 | -1.780807 | -1.827984 | -1.583983 | -0.604312
2167 | 1.775711 | 2.543287 | 0.0 | NaN | -1.334541 | -1.357355 | 0.813515 | -0.879047 | -0.053434 | -1.366216 | -1.547595 | NaN
2168 | 1.778770 | 3.065417 | 0.0 | NaN | -1.300606 | -1.442904 | -1.034297 | -0.766262 | -0.797164 | -1.366216 | 1.436252 | 1.215324
2169 | 1.780299 | 3.065469 | 0.0 | NaN | -1.327000 | -0.872575 | -0.695870 | -0.856490 | -1.121046 | -0.904447 | -0.019283 | 20.230516
2170 | 1.781829 | 3.065521 | 0.0 | NaN | -1.311918 | 0.524732 | -0.804166 | -0.676034 | -0.461286 | 0.096051 | 0.162659 | 0.487469
2171 | 1.783358 | 3.065573 | 0.0 | NaN | -1.300606 | -1.528454 | -0.316831 | -0.292566 | 1.110144 | 0.173012 | -1.256488 | -0.513330